Hi, my name is Abigail Elliott and I'm a rising senior at Natick High School just outside of Boston, MA. In my free time I love to play with my dog Speckles, play guitar, run, and hike the White Mountains. At school I manage audio design for our drama department and am involved with peer tutoring, Syrian Relief Club, and Math League. Although my dad jokes about me becoming a rodeo clown, my true interests lie in math and science. While I'm not yet sure which path in STEM I want to pursue, I love biology and am looking forward to a great summer!








Research Project: Personal Genomics and Gene Sequencing Technology

It is becoming increasingly popular to sequence entire genomes, of both humans and other species. Advances in sequencing technologies have made this possible by reducing the price, time, and labor required.

Genomic Sequencing Trends:
The first human genome sequence was completed in 2003 by the Human Genome Project. The project took about 13 years and cost around 2.7 billion dollars, primarily using the Sanger method of sequencing. Since then, the number of human genomes sequenced annually has increased rapidly. The diagram below shows the increase in human genome sequencing projects between 1998 and 2014. Since 2014, these numbers have continued to rise as genome sequencing becomes more commercialized and available for public use.
[Image: increase in human genome sequencing projects, 1998 to 2014]
Alongside the increase in genome sequencing, the cost and time required to sequence an individual genome have decreased. This is due to rapidly advancing sequencing technology, as well as bioinformatics software becoming more efficient and easier to use for scientists, data analysts, and, now, consumers. The graphic below shows the cost trend of individual genome sequencing. Moore’s Law refers to the observation made by Gordon Moore, a co-founder of Intel, that the number of transistors on a computer chip doubles roughly every two years. It appears on the graphic as a benchmark: a technology that keeps pace with Moore’s Law is considered to be improving very quickly, and the cost of sequencing has fallen even faster than that.
[Image: cost of sequencing an individual genome over time, compared with Moore’s Law]
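To make the comparison with Moore's Law concrete, here is a minimal Python sketch. It assumes that "keeping pace with Moore's Law" means a cost that halves every two years. The function names and the example numbers are illustrative assumptions and are not taken from the NHGRI data.

```python
import math


def moore_projection(start_cost, years_elapsed, halving_years=2.0):
    """Cost after `years_elapsed` if the cost halves every `halving_years`.

    The two-year halving period is an assumed reading of Moore's Law.
    """
    return start_cost * 0.5 ** (years_elapsed / halving_years)


def implied_halving_time(start_cost, end_cost, years_elapsed):
    """How many years it took, on average, for the cost to halve."""
    if not (start_cost > end_cost > 0):
        raise ValueError("expected start_cost > end_cost > 0")
    return years_elapsed * math.log(2) / math.log(start_cost / end_cost)
```

For example, a cost that starts at 100 and halves every two years would be 25 after four years. If an observed cost trend gives an implied halving time shorter than two years, it is falling faster than this Moore's Law benchmark.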


The decreasing cost of genome sequencing is the result of technological advancements in sequencing processes, such as Next Generation Sequencing.


Next Generation Sequencing:
Next Generation Sequencing (or NGS) is the blanket term for modern sequencing techniques. Methods that fall under this term include Illumina (Solexa) sequencing, Roche 454 sequencing, Ion Torrent Proton/PGM sequencing, SOLiD sequencing, and many more. Since the completion of the Human Genome Project in 2003, researchers have sought better and faster sequencing technologies. The Sanger method used in the project was slow, expensive, and labor intensive, and in its original form required the use of radioactive materials. Over the course of the following decade, sequencing technologies evolved into the faster and less expensive family of methods called Next Generation Sequencing.

NGS is typically completed in three steps: library preparation, amplification, and sequencing.

In library preparation, the DNA is prepared for the amplification and sequencing steps. First the DNA is broken apart into fragments that are easier for sequencing technology to work with. With most techniques, sequencing can only occur when the DNA is split into many pieces, because longer strands take more time and produce a less accurate sequence. Then adapters are added to the ends of the fragments, resulting in DNA fragments having a “sticky end”, or an end that easily bonds with new DNA fragments. To prevent the fragments from recombining with each other, a phosphate group is removed from the “sticky” end of each fragment, transforming a 5’-P end into a 5’-OH end.
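The fragmentation and adapter steps can be pictured with a short Python sketch. This is a simplified model under stated assumptions: real fragmentation produces pieces of varying length, while this sketch cuts fixed-size pieces, and the adapter strings passed in are made-up placeholders rather than real adapter sequences.

```python
def fragment(dna, size):
    """Cut a DNA string into consecutive pieces of at most `size` bases.

    Fixed-size cutting is an assumption made to keep the example simple.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [dna[i:i + size] for i in range(0, len(dna), size)]


def add_adapters(fragments, left_adapter, right_adapter):
    """Attach a (hypothetical) adapter sequence to both ends of each fragment."""
    return [left_adapter + piece + right_adapter for piece in fragments]
```

For instance, cutting the ten-base string "ACGTACGTAC" into pieces of four gives "ACGT", "ACGT", and a shorter leftover piece "AC".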






Amplification then occurs. There are two widely used amplification processes: emulsion PCR (polymerase chain reaction) and bridge PCR. The emulsion PCR method is time consuming and often inefficient, so the bridge PCR method is sometimes preferred. In bridge PCR, short DNA primers are attached to the surface of a flow cell and the fragmented DNA strands are added. When DNA polymerase and nucleotides are added, each attached fragment bends over and binds to a nearby primer on the surface of the flow cell. This creates a “bridge-like” structure. The fragment is copied in this form, so when one end detaches from the flow cell, identical fragments sit close together on the surface. After many rounds of copying, the DNA strands appear in “clusters” on the surface of the flow cell, with each cluster containing identical copies of one fragment ready for sequencing.





The last step in the Next Generation Sequencing process is sequencing itself. Sequencing methods vary between companies and providers, and many build on ideas from the original Sanger method of sequencing. While there are many more methods in common use, three of the most popular are listed below.
  • Roche 454 System
    • This system was the first NGS platform to be released following the Human Genome Project. Rather than stopping DNA synthesis as the Sanger method does, it detects the pyrophosphate molecule released each time a dNTP nucleotide is attached to the growing strand. The released pyrophosphate sets off a reaction that produces visible light, and because the four nucleotides are supplied one at a time, the light reveals which base was added. One advantage of this system is that it is very fast, with a sequencing run taking about 10 hours, though it may have more errors than other processes.
  • Illumina GA/HiSeq System
    • The Illumina system uses bridge PCR for amplification and fluorescent signals for sequencing. Modified versions of the four nucleotides (A, G, C, and T) are added to the strand in a “sequencing by synthesis” (or SBS) method. Each carries a different colored fluorescent dye that is detected by the system and then removed before the next nucleotide is added. Since its release in 2007, this technique has been one of the least expensive (the HiSeq 2000 has a cost of about $0.02 per 1 million base pairs).
  • Ion PGM from Ion Torrent
    • This method of sequencing was the first to use a pH detector rather than fluorescence. When a dNTP nucleotide is added to the strand, a hydrogen ion is released, and a pH sensor recognizes the resulting change in pH. If the same nucleotide occurs several times in a row, the change in pH is correspondingly larger. While this method is fast and less expensive than many of its competitors, it can be inaccurate when the same base is repeated many times in a row. A visual of the Ion Torrent technique is below.
[Image: the Ion Torrent sequencing technique]
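The idea that the pH signal grows with the number of identical bases in a row can be sketched in Python. This is a toy model, not how a real instrument processes its data: it assumes the machine offers one type of nucleotide at a time in a repeating order (the order "TACG" here is an assumption), that the signal is exactly proportional to the number of bases added, and it treats the input string as the strand being built rather than modeling the complementary template.

```python
def flow_signals(synthesized, flow_order="TACG", max_flows=1000):
    """Simulate offering one nucleotide type at a time to a growing strand.

    Returns a list of (base, signal) pairs, where signal is the number of
    bases added during that flow (0 if the offered base did not fit).
    """
    signals = []
    position = 0
    flow = 0
    while position < len(synthesized) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run_length = 0
        # A run of identical bases is added in a single flow.
        while position < len(synthesized) and synthesized[position] == base:
            run_length += 1
            position += 1
        signals.append((base, run_length))
        flow += 1
    return signals


def call_bases(signals):
    """Rebuild the sequence from the (base, signal) pairs."""
    return "".join(base * count for base, count in signals)
```

In this model the sequence "TTAGC" gives a signal of 2 on the first T flow. On a real instrument, telling a signal of 7 from a signal of 8 is much harder than telling 1 from 2, which is why long runs of the same base are a source of error.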


Modern Analysis of the Genome:
There is still much that a DNA sequence cannot reveal. Personal genomics is a fairly new scientific discipline and has only been available to the public for about a decade. Today, the most popular motivations for individual genome sequencing are genomic ancestry and healthcare, and both applications are still very limited. As sequencing and analysis technologies advance, the potential of individual genome sequencing will likely grow.

Ancestral Gene Testing Potential:
Besides sequencing the entire genome, other genomic ancestry tests include the Y-DNA test and the mitochondrial DNA test. The Y-DNA test can only be performed on biological males, as it analyzes the DNA of the Y chromosome. This test traces paternal heritage, and results can be compared to find relatives on one's paternal side. Due to the popular custom in heterosexual marriages for the wife to take the husband's last name, which is then passed on to their children, this test can also show the history of a surname. It is often used in conjunction with ancestral research because surnames can be traced back to geographic regions or large family lineages. The image below illustrates the Y-DNA test following the paternal line of ancestry. As shown by the grey squares, there are many ancestors that cannot be identified by this test, showing the limitations of the Y-DNA test.
[Image: the Y-DNA test following the paternal line of ancestry; grey squares mark ancestors the test cannot identify]


The mitochondrial DNA (mtDNA) test traces the maternal ancestral line, as mitochondrial DNA is passed down from mother to child. This test can be done on any person. The mitochondrial DNA sequence has only 16,569 base pairs, so it is generally inexpensive to sequence and therefore inexpensive for the consumer. Like the Y-DNA test, these tests can be compared between people to find common ancestry in the maternal line. If two people's mtDNA sequences are identical, they likely share a common maternal-line ancestor one to fifty generations back.
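Comparing two mtDNA results comes down to counting the positions where the sequences differ. Here is a minimal Python sketch of that comparison. It assumes the two sequences are already lined up and have the same length, and the function name is made up for illustration. Testing companies use more careful alignment than this.

```python
def count_differences(seq_a, seq_b):
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)
```

A result of 0 means the two sequences are identical, which is the case described above where two people likely share a maternal-line ancestor.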


These two tests are very common DNA tests because they are readily available and generally inexpensive. However, they only sequence a small part of your genome. The image below demonstrates the limitations of the combination of these tests. Full genome sequencing with new technologies has revolutionized genomic ancestry analysis by allowing researchers to identify much more information about one’s genetic history than was previously possible.
[Image: limitations of the combined Y-DNA and mitochondrial DNA tests]
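The limitation can also be put into numbers. Each generation back doubles the number of ancestors, but the Y-DNA test follows only one of them (the father's father's father, and so on) and the mtDNA test follows only one other (the mother's mother's mother, and so on). The Python sketch below works out that fraction. It assumes no ancestor appears twice in the family tree, and the function names are made up for illustration.

```python
def ancestors_in_generation(generations_back):
    """Number of ancestors in one generation (1 = parents, 2 = grandparents)."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 2 ** generations_back


def traced_fraction(generations_back, is_male=True):
    """Fraction of that generation's ancestors reached by Y-DNA plus mtDNA.

    A biological male can take both tests (2 ancestors traced per
    generation); anyone else can take only the mtDNA test (1 ancestor).
    """
    traced = 2 if is_male else 1
    return traced / ancestors_in_generation(generations_back)
```

Ten generations back there are 1,024 ancestors, and the two tests together reach only 2 of them, or about 0.2%.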

Clinical Potential:
Full genome sequencing has enormous potential both today and in the near future. DNA sequencing allows scientists to analyze mutations and changes in an individual's DNA that may increase the risk of disease or other conditions. By evaluating DNA, scientists can gain knowledge about diseases that kill millions each year and possibly develop treatments and preventative measures. In the future, genome sequencing will likely be very common and millions of sequenced genomes will be on record. The more information available, the more researchers can learn from DNA sequences through comparison. An emerging field named “personalized medicine” allows diagnosis and treatment to be tailored to an individual patient rather than to a group. By analyzing a patient's DNA, a doctor could learn in advance how the patient is likely to react to a procedure, or use the information to help with diagnosis. Moreover, those at increased risk of certain disorders could take preventative steps.


Genetic Discrimination and Protective Legislation:
Below are some of the current laws and regulations that protect people’s medical and genetic information.
  • HIPAA: Health Insurance Portability and Accountability Act: Protects confidential health information and limits to whom health information and records can be released. In 2013 this act was updated to establish that genetic information (including DNA sequences) is considered medical information and is therefore protected under the act.
  • GINA: Genetic Information Nondiscrimination Act: The newest protective act (passed in 2008), this legislation protects people’s genetic information from being released to the public. Additionally, it ensures that health insurers and employers cannot discriminate against groups or individuals based on their genetic information.
  • Common Rule: Adopted in 1991, the Federal Policy for the Protection of Human Subjects, or “Common Rule,” requires that research projects get consent from human subjects before the research is done. The policy mandates that human subjects give “informed consent.”

These laws were passed to ensure that health information is kept private and to prevent “genetic discrimination.” Genetic discrimination refers to unfair treatment based on one’s genetic makeup and predisposed conditions visible in sequenced DNA. A 2014 study revealed that only 20% of people and 55% of doctors had heard of GINA (Personal Genetics Education Project).

A new bill called HR 1313 is moving through Congress that has the potential to override the rights defined in GINA. This bill would grant employers the right to require genetic testing and to view the results, undermining privacy rights gained under GINA and HIPAA. The bill has not been passed.

Many caution that genome sequencing may have implications for information security and privacy. To address this problem, some scientists are setting up resources for scientific research that still respect the individual rights of patients and subjects. Two professors from the University of Alabama and the University of North Carolina received grants from the National Science Foundation to begin an education initiative regarding genetic information storage and privacy rights. They aim to design a “web-based tool” that gives researchers access to large amounts of anonymous genomic data while still maintaining security for those with health records on file. As the number of available gene sequences grows, privacy rights will become a larger issue. Maintaining communication between subjects, doctors, and companies, as well as federal and state protective legislation, is essential.

Works Cited:
Begley, Sharon. "House GOP Would Let Employers Demand Workers' Genetic Test Results." Statnews. STAT, 10 Mar. 2017. Web. 12 July 2017.
"Benefits and Implications of Learning about Your DNA." PgEd. Personal Genetics Education Project, n.d. Web. 12 July 2017.
Branam, Chris. "Protecting Identities in a Sea of Big Data." ScienceDaily. ScienceDaily, 23 Sept. 2015. Web. 12 July 2017.
"Next Generation Sequencing." ATDBio. ATDBio, n.d. Web. 12 July 2017.
Stein, Rob. "Routine DNA Sequencing May Be Helpful And Not As Scary As Feared." National Public Radio. NPR, 26 June 2017. Web. 12 July 2017.
"The Cost of Sequencing a Human Genome." National Human Genome Research Institute (NHGRI). NIH, 6 July 2017. Web. 12 July 2017.
"Understanding Genetic Ancestry Testing." Molecular and Cultural Evolution Lab. University College London, n.d. Web. 12 July 2017.
"What Is Genetic Discrimination?" PgEd. Personal Genetics Education Project, n.d. Web. 12 July 2017.

